R Introduction

A language for communicating with machines and people

Dr. Peng Zhao (✉ peng.zhao@xjtlu.edu.cn)

Department of Health and Environmental Sciences
Xi’an Jiaotong-Liverpool University

1 Learning objectives

  • Know what R can do
  • Set up the R/RStudio environment
  • Understand how R works
  • Perform basic operations in R

2 Pros & cons of statistical software

Software | Difficulty | Type | Cost      | Usage       | Support         | Best for
Excel    | Easy       | GUI  | Cheap     | Wide        | Widespread      | Graphs
R        | Difficult  | Code | Free      | Increasing  | Strongly online | Cutting edge
SPSS     | Medium     | GUI  | Expensive | Social Sci. | Manual          | Statistics
SAS      | Difficult  | Code | Expensive | Decreasing  | Manual          | Complex

3 Installation of R

  • Online compiler

  • Main program (Mandatory): R

  • Integrated Development Environment (Highly recommended): RStudio

  • R Packages

install.packages(c("beginr", "ggplot2", "GGally", "ggplotgui", "learnr", "mindr", "MSG", "pinyin", "Rcmdr", "plotly", "remotes", "swirl"))
remotes::install_github("pzhaonet/fecitr")
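
Running install.packages() again reinstalls packages you already have. The following is a minimal sketch of installing only what is missing; `missing_pkgs` is a hypothetical helper name chosen for illustration, not a function from any package, and the package list is just an example.

```r
# Hypothetical helper: return the names in `pkgs` that are not installed yet.
missing_pkgs <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

wanted <- c("ggplot2", "GGally", "plotly", "remotes")
to_install <- missing_pkgs(wanted)
to_install

# Install only in an interactive session, so a script never installs silently.
# Installing needs an internet connection.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Packages that ship with R, such as stats and utils, are always reported as installed.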

4 What is R

R is a programming language for statistical computing and graphics supported by the R Core Team and the R Foundation for Statistical Computing. Created by statisticians Ross Ihaka and Robert Gentleman, R is used among data miners, bioinformaticians and statisticians for data analysis and developing statistical software. According to user surveys and studies of scholarly literature databases, R is one of the most commonly used programming languages used in data mining. As of October 2022, R ranks 12th in the TIOBE index, a measure of programming language popularity, in which the language peaked in 8th place in August 2020.

— Wikipedia: R (programming language), 2022-10-20

R is far more than that: it is a way to communicate with your computer.

5 What can R do

library(beginr)
plotcolors()

library(pinyin)
py('西交利物浦大学', dic = pydic())

library(mindr)
mm(c('# Pros', '# Cons'), root = 'R language')

6 Basic operations

6.1 RStudio IDE

Four Panes:

  • Script
  • Console
  • Environment: variables and functions
  • Files, folders, graphs, and help
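
To see what the Environment pane shows, you can create a variable and a function in the console. This is a small sketch; the names `heights` and `to_cm` are made up for illustration.

```r
# Objects you create appear in the Environment pane.
heights <- c(1.62, 1.75, 1.80)   # a variable (numeric vector, in metres)
to_cm <- function(m) m * 100     # a function

ls()            # the same names, listed in the console
to_cm(heights)  # apply the function to the variable
```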

Create an R project:

  • File - New Project - New Directory - New Project
  • Organize your R scripts and outputs in a project:
      • Save your related files in the project folder
      • Always start your work from the .Rproj file

Create a script (.R):

  • File - New File - R Script. Hotkey: Ctrl + Shift + N

Use hotkeys:

  1. Ctrl + Enter (run the line where the cursor is)
  2. F1 (get help)
  3. Shift + Alt + K (see all shortcuts)

6.2 Demo data

write.csv(iris, 'dat.csv', row.names = FALSE)

6.3 Import data

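# stringsAsFactors = TRUE reads Species as a factor (the default is FALSE since R 4.0)
dat <- read.csv('dat.csv', stringsAsFactors = TRUE)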
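
After importing, it is worth checking that the data arrived as expected. The sketch below writes the built-in iris data to a temporary file so that it runs on its own; `dat_check` is an illustrative name, and `stringsAsFactors = TRUE` is used here so that the text column is read as a factor.

```r
# Write the built-in iris data to a temporary file, then read it back.
tmp <- tempfile(fileext = ".csv")
write.csv(iris, tmp, row.names = FALSE)
dat_check <- read.csv(tmp, stringsAsFactors = TRUE)

dim(dat_check)      # number of rows and columns
str(dat_check)      # type of each column
head(dat_check, 3)  # first three rows
```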

6.4 Statistics/Calculation

# mean and standard deviation
mean(dat$Sepal.Length)
sd(dat$Sepal.Length)

# more statistics
summary(dat)

# groups
tapply(dat$Sepal.Length, dat$Species, mean)
tapply(dat$Sepal.Length, dat$Species, sd)

# analysis of variance
xx <- aov(Sepal.Length ~ Species, data = dat)
summary(xx)

# regression
mylm <- lm(Petal.Width ~ Petal.Length, data = dat)
summary(mylm)
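
To connect these functions to their formulas, the sketch below recomputes the mean and the sample standard deviation by hand and then pulls individual numbers out of a fitted regression. It uses the built-in iris data directly; the names `m`, `s`, and `fit` are illustrative.

```r
x <- iris$Sepal.Length
n <- length(x)

m <- sum(x) / n                       # mean: sum divided by count
s <- sqrt(sum((x - m)^2) / (n - 1))   # sample sd: note the n - 1 denominator

all.equal(m, mean(x))   # compare with the built-in functions
all.equal(s, sd(x))

# Fit the regression and extract parts of the result.
fit <- lm(Petal.Width ~ Petal.Length, data = iris)
coef(fit)                 # intercept and slope
summary(fit)$r.squared    # proportion of variance explained

# Predicted petal width for a petal length of 4 cm.
predict(fit, newdata = data.frame(Petal.Length = 4))
```

Writing the formula with column names and a `data` argument is what lets predict() accept new data with the same column name.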

6.5 Graphs

plot(x = dat$Petal.Length, 
     y = dat$Petal.Width)
abline(mylm)
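
A plot usually needs axis labels and often has to be saved to a file. The following sketch shows one way to do both with base graphics, assuming a PDF in the temporary directory is an acceptable output; `out_file` is a hypothetical path.

```r
out_file <- file.path(tempdir(), "petal.pdf")   # hypothetical output path

pdf(out_file, width = 7, height = 5)            # open a PDF graphics device
plot(x = iris$Petal.Length, y = iris$Petal.Width,
     xlab = "Petal length (cm)", ylab = "Petal width (cm)",
     main = "Iris petals")
abline(lm(Petal.Width ~ Petal.Length, data = iris), col = "red")
dev.off()                                       # close the device to finish the file

file.exists(out_file)
```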

6.6 Packages

summary(dat)
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
 Median :5.800   Median :3.000   Median :4.350   Median :1.300  
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
       Species  
 setosa    :50  
 versicolor:50  
 virginica :50  

# plot the summary with the fecitr package
library(fecitr)
plot_summary(dat, base = "hist", if_box = TRUE)

library(ggplot2)
ggplot(dat, aes(Petal.Length,Petal.Width))+ 
  geom_point() + 
  geom_smooth(method = "lm")

library(GGally)
ggpairs(dat, aes(color = Species, alpha = 0.1))

# interactive version; the native pipe |> needs R >= 4.1
library(plotly)
ggpairs(dat, aes(color = Species, alpha = 0.1)) |> 
  ggplotly()

6.7 Export data

# center Sepal.Length on its mean, then save without the row-name column
dat$new <- dat$Sepal.Length - mean(dat$Sepal.Length)
write.csv(dat, "dat2.csv", row.names = FALSE)
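
One way to check an export is to read the file back and compare. The sketch below does this with the built-in iris data and temporary files, and also shows what happens to the columns when row names are written; `dat_out`, `dat_back`, and `with_rownames` are illustrative names.

```r
dat_out <- iris
dat_out$new <- dat_out$Sepal.Length - mean(dat_out$Sepal.Length)

# Export without row names and read the file back.
tmp <- tempfile(fileext = ".csv")
write.csv(dat_out, tmp, row.names = FALSE)
dat_back <- read.csv(tmp)

names(dat_back)                        # original columns plus `new`
all.equal(dat_back$new, dat_out$new)   # values survive the round trip

# With the default row.names = TRUE, the row names become an extra first column.
tmp2 <- tempfile(fileext = ".csv")
write.csv(dat_out, tmp2)
with_rownames <- read.csv(tmp2)
ncol(with_rownames)
```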

6.8 GUI

library(Rcmdr)

library(ggplotgui)
ggplot_shiny()

7 Move forward

7.1 Partners

7.2 Help documents

demo(graphics)
demo(persp)
demo(image)
demo(plotmath)
demo(nlm)
demo(lm.glm)
demo(smooth)

# ggplot2
example(qplot)

# GGally
example(ggpairs)

# MSG
library(MSG)
demo(basketball)
demo(pointArts)
demo(gradArrows1) # Gradient descent method
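
Besides demos, every function has a help page that you can open or search from the console. This is a brief sketch using functions that ship with R; `found` is an illustrative name.

```r
help("read.csv")              # open the help page; the same as ?read.csv

found <- apropos("^read\\.")  # names of available functions starting with "read."
found

example("mean")               # run the examples from the help page of mean()
```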

7.3 R packages

library(swirl)
swirl()  # start the interactive course

library(learnr)
run_tutorial("ex-data-basics", "learnr")

7.4 Books

7.5 Search engine

7.6 Forums

8 Further readings